Add the SEP-2663 Tasks extension (core)#3005

Open
Kludex wants to merge 11 commits into main from tasks-extension-sep-2663

Conversation

@Kludex Kludex commented Jun 26, 2026

Member

Summary

The SEP-2663 Tasks extension (io.modelcontextprotocol/tasks) — the conformant core. Built on the extension API from #3003 (this PR is based on that branch; merge after it).

SEP-2663 (Final) is wire-incompatible with the 2025-11-25 in-core Tasks design still carried (types-only) in mcp_types, so the extension defines its own SEP-2663-shaped models.

from mcp.server.mcpserver import MCPServer
from mcp.server.tasks import Tasks

mcp = MCPServer("demo", extensions=[Tasks()])

What's implemented (conformant core)

  • Server-decided augmentation. The server chooses, per request, to defer a tools/call as a task. The legacy params.task field is ignored (it is not the opt-in). Only a client that declared the extension on a modern (2026-07-28) connection is augmented — a legacy handshake cannot carry the capability back, so it is never augmented.
  • Envelope. A task-augmented tools/call returns a flat CreateTaskResult (resultType: "task", taskId/status/createdAt/lastUpdatedAt/ttlMs).
  • tasks/get returns a DetailedTask (resultType: "complete"); a completed task inlines the original CallToolResult. A tool result with isError: true is a completed task (failed is reserved for JSON-RPC errors).
  • tasks/cancel is an empty ack. tasks/result is not registered → -32601 (removed in SEP-2663). A tasks/* call from a non-declaring client → -32003 with a requiredCapabilities payload. Task ids are entropy-bearing bearer capabilities.

Ships a runnable tasks story and a migration note.

Deferred to follow-ups

Each needs deeper SDK plumbing and is called out in the module docstring, the README, and the migration guide:

  • tasks/update + the MRTR input_required loop
  • ToolExecution.taskSupport gating with the -32021 required-task error
  • notifications/tasks
  • SEP-2243 task routing headers

These map to the remaining conformance tasks-* scenarios; the core targets tasks-dispatch-and-envelope, tasks-capability-negotiation, tasks-wire-fields, tasks-lifecycle (partial), tasks-request-state-removal.

Testing

14 spec-derived in-memory Client(server) tests, 100% coverage of tasks.py, strict-no-cover clean, pyright + ruff + markdownlint green, both story legs (in-memory + http-asgi) pass.

AI Disclaimer

This PR was developed with the assistance of either Claude or Codex. I've reviewed and verified the changes.

Kludex added 11 commits June 26, 2026 20:17
Thread an `extensions` argument through the low-level `Server.get_capabilities`
and `create_initialization_options` (mirroring `experimental`), backed by a
`Server.extensions` attribute so the streamable-HTTP `server/discover` path
advertises it too. Add an `extensions` branch to `Connection.check_capability`
(presence-of-identifier, since settings are negotiated per-extension) and let a
client advertise its own support via `Client(extensions=...)` /
`ClientSession(extensions=...)`, mirrored into `ClientCapabilities.extensions`.

Introduce `Extension`, a narrow base class (HTTPX `Transport`/`Auth` style) whose
methods default so an extension overrides only what it needs: `settings()`,
`tools()`, `resources()`, `methods()`, and `intercept_tool_call()`. `MCPServer`
accepts `extensions=[...]` at construction and `add_extension()` later, applying a
closed set of contributions (tool/resource/method bindings) and composing every
extension's `tools/call` interceptor into one `ServerMiddleware`. The server never
hands itself to an extension; the extension declares what it adds as data.

`Apps` is an additive `Extension`: `@apps.tool(resource_uri=...)` binds a tool to a
`ui://` UI resource via `_meta.ui.resourceUri`, `add_html_resource()` serves the
HTML at `text/html;profile=mcp-app`, and `client_supports_apps(ctx)` gates the
SEP-2133 text-only fallback. Drop the now-exercised `# pragma: no cover` on
`TextResource.read()` (the Apps resource path covers it).

`Tasks` is an interceptive `Extension`: `intercept_tool_call` records a
task-augmented `tools/call` and stamps the task id into
`_meta[io.modelcontextprotocol/related-task]`, while `methods()` serves
`tasks/get`, `tasks/result`, `tasks/cancel`, and `tasks/list` over an in-memory
store. It demonstrates the interceptive seam; the augmented call returns a
`CallToolResult` rather than `CreateTaskResult` because the `tools/call` result
schema admits only `CallToolResult | InputRequiredResult` (TODO L56). Also add
the negotiation-plumbing tests shared by both extensions.

Wire runnable `apps` and `tasks` stories (in-memory + http-asgi) into the manifest
and document the extensions API in the migration guide.

Drop the public `MCPServer.add_extension`; extensions are fixed at construction
via `extensions=[...]` (the apply logic moves to a private `_apply_extension`,
with the `tools/call` interceptor composed once afterwards). This matches the
declarative design and removes the mid-connection mutation footgun. Rework the
tasks story around a `render_report` tool whose multi-step work motivates running
it as a task, with named `_start_task` / `_get_task` / `_task_result` helpers so
the client reads as a clear lifecycle.

Make explicit that a plain tools/call is unchanged - only a call carrying a
`task` field becomes a task - and document that per-tool gating on the declared
`ToolExecution.task_support` is not enforced by this reference extension.

# Conflicts:
#	src/mcp/server/mcpserver/__init__.py
#	src/mcp/server/mcpserver/server.py

The Tasks implementation was built against the 2025-11-25 in-core design still
carried (types-only) in mcp_types, not SEP-2663 (the extension that ships in
2026-07-28). They diverge on nearly every wire-observable detail: SEP-2663 makes
the server the sole decider (ignoring the legacy params.task), uses the
{tasks/get, tasks/update, tasks/cancel} method set (no tasks/list or
tasks/result), returns a CreateTaskResult discriminated by resultType: "task"
(not a CallToolResult with _meta), advertises {} settings, gates on
execution.taskSupport, and renames ttl/pollInterval to ttlMs/pollIntervalMs.

Remove the extension, its tests, and its story rather than ship a spec-violating
example; restore tasks to the deferred manifest list with a SEP-2663 pointer. The
generic Extension API and the Apps reference extension are unaffected and still
at 100% coverage. Tasks returns as a separate PR rewritten to SEP-2663 with the
conformance tasks-* scenarios wired in.

…Apps fixes

Framework:
- Move the Extension base class from mcp/server/mcpserver/extension.py to
  mcp/server/extension.py so helper-tier modules (apps.py) and third-party
  extensions depend on the base, not the composition tier.
- Enforce a vendor-prefix/name identifier via __init_subclass__ (and at apply
  time for per-instance identifiers), failing at class-definition rather than
  late with AttributeError.
- Add MethodBinding.protocol_versions so an extension method can be scoped to
  specific wire versions; out-of-range requests get METHOD_NOT_FOUND.
- Add require_client_extension(ctx, identifier) raising the -32021 missing
  required client capability error with a requiredCapabilities payload.

Apps:
- client_supports_apps now checks the client advertised the
  text/html;profile=mcp-app MIME type, not just the extension key.
- Add a visibility kwarg to @apps.tool (_meta.ui.visibility).
- Let add_html_resource set csp/permissions/domain/prefers_border on the
  resource _meta via typed ResourceCsp/ResourcePermissions models.
- Fix the meta= double-keyword TypeError by making meta an explicit param
  merged with the ui entry instead of passing through **tool_kwargs.

Implement io.modelcontextprotocol/tasks per SEP-2663 (Final), wire-incompatible
with the 2025-11-25 in-core design still carried (types-only) in mcp_types, so the
extension defines its own SEP-2663-shaped models:

- The server decides task augmentation per request; the legacy params.task field
  is ignored. Only a client that declared the extension on a modern (2026-07-28)
  connection is augmented - a legacy handshake cannot carry the capability, so it
  is never augmented.
- A task-augmented tools/call returns a flat CreateTaskResult (resultType: "task",
  taskId/status/createdAt/lastUpdatedAt/ttlMs).
- tasks/get returns a DetailedTask (resultType: "complete"); a completed task
  inlines the original CallToolResult. isError: true is a completed task (failed
  is reserved for JSON-RPC errors).
- tasks/cancel is an empty ack. tasks/result is not registered, so it returns
  -32601. A tasks/* call from a non-declaring client returns -32003 with a
  requiredCapabilities payload. Task ids are entropy-bearing.

Ships a runnable tasks story (server-decided augmentation + tasks/get polling) and
a migration note. Deferred to follow-ups (each needs deeper SDK plumbing):
tasks/update + the MRTR input_required loop, ToolExecution.taskSupport gating with
-32021, notifications/tasks, and SEP-2243 task routing headers.

@cubic-dev-ai cubic-dev-ai Bot left a comment

4 issues found across 9 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="examples/stories/tasks/README.md">

<violation number="1" location="examples/stories/tasks/README.md:3">
P3: Docs are inconsistent: the Tasks story is documented as implemented/runnable here, but the stories index still labels `tasks/` as "not yet implemented".</violation>

<violation number="2" location="examples/stories/tasks/README.md:11">
P2: README run instructions imply stdio demonstrates tasks flow, but stdio cannot negotiate the tasks extension. Users running the default command will not see the documented task behavior.</violation>
</file>

<file name="src/mcp/server/tasks.py">

<violation number="1" location="src/mcp/server/tasks.py:170">
P2: A failing tool call leaves a pre-created task permanently stored, causing in-memory task leaks on error paths.</violation>

<violation number="2" location="src/mcp/server/tasks.py:171">
P2: `intercept_tool_call` drops valid `BaseModel` results to `{}` before persisting task output. In extension chains this can make `tasks/get` return an empty `result` instead of the real tool payload.</violation>
</file>

## Run it

```bash
# stdio (default — the client spawns the server as a subprocess)
```

@cubic-dev-ai cubic-dev-ai Bot Jun 26, 2026

P2: README run instructions imply stdio demonstrates tasks flow, but stdio cannot negotiate the tasks extension. Users running the default command will not see the documented task behavior.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At examples/stories/tasks/README.md, line 11:

<comment>README run instructions imply stdio demonstrates tasks flow, but stdio cannot negotiate the tasks extension. Users running the default command will not see the documented task behavior.</comment>

<file context>
@@ -1,24 +1,48 @@
+## Run it
+
+```bash
+# stdio (default — the client spawns the server as a subprocess)
+uv run python -m stories.tasks.client
+
</file context>
Suggested change
# stdio (default — the client spawns the server as a subprocess)
# stdio (legacy handshake only; cannot negotiate `io.modelcontextprotocol/tasks` yet)

Comment thread src/mcp/server/tasks.py
now = self._clock()
task = self._store.create(now, self._default_ttl_ms)
result = await call_next(ctx)
payload = result if isinstance(result, dict) else {}

@cubic-dev-ai cubic-dev-ai Bot Jun 26, 2026

P2: intercept_tool_call drops valid BaseModel results to {} before persisting task output. In extension chains this can make tasks/get return an empty result instead of the real tool payload.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp/server/tasks.py, line 171:

<comment>`intercept_tool_call` drops valid `BaseModel` results to `{}` before persisting task output. In extension chains this can make `tasks/get` return an empty `result` instead of the real tool payload.</comment>

<file context>
@@ -0,0 +1,232 @@
+        now = self._clock()
+        task = self._store.create(now, self._default_ttl_ms)
+        result = await call_next(ctx)
+        payload = result if isinstance(result, dict) else {}
+        # A tool result (even isError: true) is a completed task; `failed` is for
+        # JSON-RPC errors, which surface as a raised MCPError, not a result here.
</file context>

Comment thread src/mcp/server/tasks.py
return await call_next(ctx)
now = self._clock()
task = self._store.create(now, self._default_ttl_ms)
result = await call_next(ctx)

@cubic-dev-ai cubic-dev-ai Bot Jun 26, 2026

P2: A failing tool call leaves a pre-created task permanently stored, causing in-memory task leaks on error paths.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp/server/tasks.py, line 170:

<comment>A failing tool call leaves a pre-created task permanently stored, causing in-memory task leaks on error paths.</comment>

<file context>
@@ -0,0 +1,232 @@
+            return await call_next(ctx)
+        now = self._clock()
+        task = self._store.create(now, self._default_ttl_ms)
+        result = await call_next(ctx)
+        payload = result if isinstance(result, dict) else {}
+        # A tool result (even isError: true) is a completed task; `failed` is for
</file context>

`resultType: "task"` envelope, `execution.taskSupport` gating, and `ttlMs`
fields — so it lands in a separate PR with the conformance `tasks-*` scenarios
wired in.
Task-augmented execution (SEP-2663). A client declares the

@cubic-dev-ai cubic-dev-ai Bot Jun 26, 2026

P3: Docs are inconsistent: the Tasks story is documented as implemented/runnable here, but the stories index still labels tasks/ as "not yet implemented".

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At examples/stories/tasks/README.md, line 3:

<comment>Docs are inconsistent: the Tasks story is documented as implemented/runnable here, but the stories index still labels `tasks/` as "not yet implemented".</comment>

<file context>
@@ -1,24 +1,48 @@
-`resultType: "task"` envelope, `execution.taskSupport` gating, and `ttlMs`
-fields — so it lands in a separate PR with the conformance `tasks-*` scenarios
-wired in.
+Task-augmented execution (SEP-2663). A client declares the
+`io.modelcontextprotocol/tasks` extension; the server may then answer a
+`tools/call` with a `CreateTaskResult` (carrying a task id) instead of blocking,
</file context>

Comment thread src/mcp/server/tasks.py
Comment on lines +52 to +53
MISSING_REQUIRED_CLIENT_CAPABILITY = -32003
"""JSON-RPC error code: a `tasks/*` call from a client that did not declare the extension."""

Contributor

MISSING_REQUIRED_CLIENT_CAPABILITY redefined as -32003 — schema (and the rest of the SDK) use -32021

The core schema defines MISSING_REQUIRED_CLIENT_CAPABILITY = -32021 (schema.ts L418), and the SDK's own mcp_types.MISSING_REQUIRED_CLIENT_CAPABILITY, the generated MissingRequiredClientCapabilityError.code: Literal[-32021], require_client_extension(), and the shared/inbound.py HTTP-status map all use -32021 for this exact semantic. SEP-2663's prose still prints -32003 in three places, but that predates the spec PR #2907 error-code renumber and was never updated — the schema is the source of truth for codes.

Import the constant from mcp_types (or just call require_client_extension) so _require_tasks_capability() is consistent with the rest of the SDK and the conformance harness; update the module docstring, docs/migration.md, and the test that asserts this code.

AI Disclaimer

Comment thread src/mcp/server/tasks.py
Comment on lines +170 to +174
result = await call_next(ctx)
payload = result if isinstance(result, dict) else {}
# A tool result (even isError: true) is a completed task; `failed` is for
# JSON-RPC errors, which surface as a raised MCPError, not a result here.
self._store.complete(task.task_id, self._clock(), payload)

Contributor

InputRequiredResult is augmented into a task and misreported as completed

An InputRequiredResult reaching the interceptor is stored via _store.complete(...) and reported by tasks/get as status: "completed" with the InputRequiredResult inlined as result — but CompletedTask.result for a tools/call task must be a CallToolResult (SEP-2663 §Task Status).

Per SEP-2663 §Task Creation, MRTR exchanges SHOULD resolve synchronously before CreateTaskResult — task status: "input_required" is a separate inputRequests/tasks/update mechanism, not a mapping for InputRequiredResult. So even with the tasks/update loop deferred, the interceptor should inspect payload.get("resultType") and pass an "input_required" result through unchanged (no task created), letting the MRTR loop run on the original request and augmenting only the eventual CallToolResult.
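
A minimal sketch of that pass-through, assuming the task is only recorded once the downstream result is known. The simplified `call_next` signature, the dict-backed store, and the counter-based task id are stand-ins for illustration; the real interceptor takes a context and must generate ids with entropy.

```python
import asyncio
from typing import Any, Awaitable, Callable

CallNext = Callable[[], Awaitable[dict[str, Any]]]


async def intercept_tool_call(
    call_next: CallNext, tasks: dict[str, dict[str, Any]]
) -> dict[str, Any]:
    """Augment only a final tool result; let input_required pass through."""
    payload = await call_next()
    if payload.get("resultType") == "input_required":
        # No task is created: the MRTR loop continues on the original request.
        return payload
    # Placeholder id for the sketch only; real ids need unguessable entropy.
    task_id = f"task_{len(tasks) + 1}"
    tasks[task_id] = payload
    return {"resultType": "task", "taskId": task_id}
```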

AI Disclaimer

Comment thread src/mcp/server/tasks.py
Comment on lines +130 to +132
    def cancel(self, task_id: str, now: str) -> None:
        task = self._tasks[task_id]
        self._tasks[task_id] = task.model_copy(update={"status": "cancelled", "last_updated_at": now})

Contributor

tasks/cancel clobbers terminal status and drops the completed result

tasks/cancel unconditionally overwrites status to "cancelled" with no terminal-state guard, so a completed task regresses to cancelled and the subsequent tasks/get drops the already-computed result (it only inlines on status == "completed"). In this inline core every reachable task is already terminal when tasks/cancel arrives, so this is the handler's only reachable behavior — and test_cancel_task_... snapshots the broken transition.

Terminal statuses (completed/failed/cancelled) should be immutable: SEP-2663 §Cancellation makes cancel cooperative and ack-only, and explicitly allows a task to "reach a terminal status other than cancelled if the work finished before cancellation could take effect." Guard in _handle_cancel (or TaskStore.cancel) and return the empty ack without mutating a terminal task.

AI Disclaimer

Comment thread src/mcp/server/tasks.py
Comment on lines +61 to +62
def _fixed_clock() -> str:
    return "1970-01-01T00:00:00Z"

Contributor

Default clock is a fixed epoch stub, not a real wallclock

The default clock is _fixed_clock, a hard-coded "1970-01-01T00:00:00Z" stub — so every server using the documented Tasks() / Tasks(default_ttl_ms=...) emits the Unix epoch for createdAt/lastUpdatedAt on the wire, breaking client TTL-expiry math and making the SEP-2663 timestamp fields useless.

The default should be a real UTC wallclock (e.g. datetime.now(timezone.utc).isoformat()); the fixed clock belongs only in tests. test_create_task_result_uses_default_clock_when_none_injected currently pins the wrong behaviour and will need updating.
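
A sketch of the suggested default, with the fixed clock kept available for injection in tests. The `Tasks` class here is a cut-down stand-in that only shows the clock wiring, and the `clock` keyword name is an assumption about the constructor.

```python
from datetime import datetime, timezone
from typing import Callable, Optional

Clock = Callable[[], str]


def utc_clock() -> str:
    """Real wallclock: timezone-aware UTC, serialized as ISO 8601."""
    return datetime.now(timezone.utc).isoformat()


def fixed_clock() -> str:
    """Deterministic clock, intended for tests only."""
    return "1970-01-01T00:00:00Z"


class Tasks:
    """Stand-in for the extension class, showing only the clock default."""

    def __init__(self, clock: Optional[Clock] = None) -> None:
        # Fall back to the real clock when no clock is injected.
        self._clock: Clock = clock or utc_clock
```

`datetime.isoformat()` renders UTC as a `+00:00` offset rather than a `Z` suffix; whether the wire format requires `Z` is not settled by this thread.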

AI Disclaimer

Comment thread src/mcp/server/tasks.py
Comment on lines +168 to +174
now = self._clock()
task = self._store.create(now, self._default_ttl_ms)
result = await call_next(ctx)
payload = result if isinstance(result, dict) else {}
# A tool result (even isError: true) is a completed task; `failed` is for
# JSON-RPC errors, which surface as a raised MCPError, not a result here.
self._store.complete(task.task_id, self._clock(), payload)

Contributor

Interceptor mishandles exceptional and non-dict call_next outcomes

intercept_tool_call mishandles non-dict and exceptional call_next outcomes (the carry-forward from #3003):

  • _store.create() runs before await call_next(ctx) with no try/except, so when the tool call raises (MCPError from an unknown tool or a handler, a ValidationError, cancellation) the task entry leaks in the store at status="working" forever, and the failed status the docstring reserves for JSON-RPC errors is unreachable.
  • payload = result if isinstance(result, dict) else {} silently discards a BaseModel (or None) returned by a nested extension interceptor — HandlerResult is BaseModel | dict | None — so the completed task inlines "result": {}.

Wrap call_next in try/except to transition the task to failed (or drop the entry) on error, and serialize a BaseModel result with model_dump(by_alias=True, mode="json", exclude_none=True) (mirror _dump_result in runner.py) instead of replacing it with {}.
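
Both fixes can be sketched together as follows. This version takes the "drop the entry" option on error rather than transitioning to `failed`, and it duck-types on `model_dump` instead of importing Pydantic so the snippet stays dependency-free. `dump_result` and `run_as_task` are hypothetical names, and the dict-backed store is a stand-in for `TaskStore`.

```python
import asyncio
from typing import Any, Awaitable, Callable


def dump_result(result: Any) -> dict[str, Any]:
    """Serialize a handler result (model, dict, or None) to a JSON-ready dict."""
    if result is None:
        return {}
    if isinstance(result, dict):
        return result
    # Stand-in for an isinstance(result, BaseModel) check.
    return result.model_dump(by_alias=True, mode="json", exclude_none=True)


async def run_as_task(
    call_next: Callable[[], Awaitable[Any]],
    store: dict[str, dict[str, Any]],
    task_id: str,
) -> None:
    """Record a task around call_next without leaking it on failure."""
    store[task_id] = {"status": "working"}
    try:
        result = await call_next()
    except BaseException:
        # Covers handler errors and cancellation: drop the entry, then re-raise.
        store.pop(task_id, None)
        raise
    store[task_id] = {"status": "completed", "result": dump_result(result)}
```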

AI Disclaimer

Comment thread src/mcp/server/tasks.py
Comment on lines +110 to +138
class TaskStore:
    """In-memory record of tasks and their completed `CallToolResult` payloads."""

    def __init__(self) -> None:
        self._tasks: dict[str, Task] = {}
        self._results: dict[str, dict[str, Any]] = {}

    def create(self, now: str, ttl_ms: int | None) -> Task:
        # Task IDs are bearer capabilities for tasks/get|cancel, so they need
        # entropy a third party cannot guess or enumerate (SEP-2663 security).
        task_id = f"task_{secrets.token_urlsafe(16)}"
        task = Task(task_id=task_id, status="working", created_at=now, last_updated_at=now, ttl_ms=ttl_ms)
        self._tasks[task_id] = task
        return task

    def complete(self, task_id: str, now: str, result: dict[str, Any]) -> None:
        task = self._tasks[task_id]
        self._tasks[task_id] = task.model_copy(update={"status": "completed", "last_updated_at": now})
        self._results[task_id] = result

    def cancel(self, task_id: str, now: str) -> None:
        task = self._tasks[task_id]
        self._tasks[task_id] = task.model_copy(update={"status": "cancelled", "last_updated_at": now})

    def get(self, task_id: str) -> Task | None:
        return self._tasks.get(task_id)

    def result(self, task_id: str) -> dict[str, Any] | None:
        return self._results.get(task_id)

Contributor

[nit] TaskStore grows unboundedly; ttlMs never enforced

TaskStore only ever inserts into _tasks/_results and never removes entries, and the single Tasks instance lives for the server's lifetime — so every augmented tools/call permanently retains its full CallToolResult payload. ttl_ms is stamped on the wire but never enforced server-side, so Tasks(default_ttl_ms=60_000) as shown in the story is advisory-only.

For a long-running HTTP server this is unbounded growth (the store-growth/cleanup concern carried forward from #3003). Worth at least a TTL-based eviction on tasks/get/create, a size bound, or an explicit "no eviction — deferred" note in the module docstring alongside the other deferrals.

AI Disclaimer

Comment thread src/mcp/server/tasks.py
Comment on lines +151 to +155
    def methods(self) -> Sequence[MethodBinding]:
        return [
            MethodBinding("tasks/get", GetTaskRequestParams, self._handle_get),
            MethodBinding("tasks/cancel", CancelTaskRequestParams, self._handle_cancel),
        ]

Contributor

[nit] tasks/get / tasks/cancel not version-scoped; legacy clients get the missing-capability error instead of -32601

These bindings omit protocol_versions, so tasks/get/tasks/cancel are served on legacy (≤2025-11-25) connections too. There _require_tasks_capability returns the missing-capability error with a requiredCapabilities: {extensions: ...} payload the client structurally cannot satisfy — since the extension "only exists on the modern wire" (and SEP-2663 §Backward Compatibility says it "is not defined under the 2025-11-25 protocol version"), these should be -32601 instead. Add protocol_versions=frozenset(MODERN_PROTOCOL_VERSIONS) to both bindings.

AI Disclaimer

Base automatically changed from extension-api-sep-2133 to main June 29, 2026 09:58
2 participants